ABSTRACT

Most RNA viruses infecting mammals and other vertebrates
show profound suppression of CpG and
UpA dinucleotide frequencies. To investigate this 
functionally, mutants of the picornavirus, echovirus 
7 (E7), were constructed with altered CpG and UpA 
compositions in two 1.1-1.3 Kbase regions. Those 
with increased frequencies of CpG and UpA 
showed impaired replication kinetics and higher 
RNA/infectivity ratios compared with wild-type 
virus. Remarkably, mutants with CpGs and UpAs 
removed showed enhanced replication, larger 
plaques and rapidly outcompeted wild-type virus 
on co-infections. Luciferase-expressing E7 sub- 
genomic replicons with CpGs and UpAs removed 
from the reporter gene showed 100-fold greater luminescence.
E7 and mutants were equivalently sensitive to exogenously
added interferon-β and showed no
evidence for differential recognition by ADAR1 or 
pattern recognition receptors RIG-I, MDA5 or PKR. 
However, kinase inhibitors roscovitine and C16 partially
or entirely reversed the attenuated phenotype
of high CpG and UpA mutants, potentially through 
inhibition of currently uncharacterized pattern rec- 
ognition receptors that respond to RNA compos- 
ition. Generating viruses with enhanced replication 
kinetics has applications in vaccine production and 
reporter gene construction. More fundamentally, the 
findings introduce a new evolutionary paradigm 
where dinucleotide composition of viral genomes 
is subjected to selection pressures independently 
of coding capacity and profoundly influences host- 
pathogen interactions. 



INTRODUCTION 

Studies of RNA viruses provide major insights and func- 
tional understanding of replication mechanisms and host 
cellular interactions in which processes of mutation, 
fitness selection, recombination and sequence drift can 
be directly observed. The small size and necessarily 
compact arrangement of protein-coding sequences and 
replication elements create a range of constraints on
sequence change that are frequently both quantitatively 
and qualitatively different from evolutionary selection 
pressures on their eukaryotic and prokaryotic hosts. 
Coding regions of RNA virus genomes frequently 
contain secondary and higher order RNA structures that 
function as points of interaction with viral and host 
cellular proteins and RNA elements such as ribosomes 
(1,2). For example, cis-replicating elements of
enteroviruses in the 2C-coding region and of hepatitis C 
virus in NS5B are revealed by a marked suppression of 
sequence variability at largely unconstrained synonymous 
coding sites (3,4). Substitutions in base-paired nucleotides 
in the stem loop are selected against if they damage the 
stability of pairing, and changes are often accompanied by 
compensatory changes in the paired residue to restore 
pairing. Many RNA viruses have also evolved additional 
genes to inhibit or divert innate cell defences, typically 
embedded within other viral genes in alternative reading 
frames (5), such as the PB1-F2 protein of influenza A virus 
(6). These interactions place further constraints on virus 
sequence change in coding regions of RNA viruses over 
and above their protein-coding function. 

While these additional RNA-structure- or protein-
coding-based constraints are relatively well understood
in viruses where replication mechanisms are characterized, 
RNA virus genomes additionally evolve under a series of 
additional compositional constraints for which functional 
explanations are currently lacking. RNA viral genomes 
show an extremely wide range of base compositions;
among mammalian viruses, G + C contents range from 
as low as 33% (respiratory syncytial virus) to 69.9% 
(rubella virus). Coding regions of RNA viruses frequently 
show highly abnormal codon usage patterns (7); hepatitis
A virus shows highly restricted codon usage [formally
quantified by measures such as an effective number of
codons (Enc) of 39.8; (8)] and codon choices different
from mammalian usage [measured as a codon adaptation
index of 0.76; (9)].

However, one of the most unexpected compositional 
abnormalities, particularly among RNA and small DNA
viruses infecting mammals and plants, is the markedly 
biased frequencies of certain dinucleotides such as CpG 
and UpA throughout their genomes (10-12). For example, 
poliovirus shows a frequency of CpG dinucleotides in its
coding region that is only 54% of the expected value
calculated from its (mononucleotide) composition (based
on a 23.5% [frequency of G] × 23.4% [frequency of C]).
UpA is similarly suppressed (74% of expected value) while
other dinucleotides are over-represented (CpA: 1.32×;
UpG: 1.35×). The lack of suppression of GpC and ApU
in RNA viruses (and host genomes) demonstrates that this 
effect is specific to CpG and UpA and does not simply 
reflect evolutionary pressure for regions of greater or 
lesser duplex stability. 
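The observed/expected comparison used in this paragraph can be illustrated with a short calculation. With the poliovirus figures quoted above, the expected CpG frequency is 0.235 × 0.234, or about 0.055, so an observed value of 54% of expected corresponds to a CpG frequency of roughly 0.030. The sketch below is a minimal illustration only: the function name and the toy sequence are assumptions made for this example and are not taken from the study.

```python
from collections import Counter


def dinucleotide_oe(seq: str, dinuc: str) -> float:
    """Return the observed/expected ratio f(XpY) / (f(X) * f(Y)) for one dinucleotide."""
    seq = seq.upper().replace("T", "U")
    dinuc = dinuc.upper().replace("T", "U")
    if len(dinuc) != 2 or len(seq) < 2:
        raise ValueError("need a dinucleotide and a sequence of at least two bases")
    base_counts = Counter(seq)
    # Observed frequency: overlapping dinucleotide count over the number of positions.
    hits = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)
    observed = hits / (len(seq) - 1)
    # Expected frequency from mononucleotide composition alone.
    expected = (base_counts[dinuc[0]] / len(seq)) * (base_counts[dinuc[1]] / len(seq))
    if expected == 0:
        raise ValueError("a base of the dinucleotide is absent from the sequence")
    return observed / expected


toy = "CGCGAUAU"  # made-up sequence for illustration, not viral data
print(round(dinucleotide_oe(toy, "CG"), 3))
print(round(dinucleotide_oe(toy, "UA"), 3))
```

A ratio below 1 indicates under-representation relative to base composition; the same function applied to GpC or ApU gives the control comparison mentioned in the text.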

The overlap between dinucleotides creates inter- 
dependencies that complicate analysis of the mutational 
and selection pressures that underlie these biased compos- 
itional frequencies. For example, the methylation- 
associated elimination of CpG dinucleotides through 
C->T transitions in genomic dsDNA creates TpG and 
ApC dinucleotides, a process that potentially accounts 
for their over-representation in vertebrate DNA (10) and 
indirectly depletes TpA (13). We recently developed a 
parameterized Markov method to identify what further 
dinucleotide context-dependent mutations best account for 
compositional patterns in genomic and cytoplasmically 
expressed mRNA sequences of mammals and a range 
of other eukaryotic phyla. This was extended to analyse 
compositional abnormalities of RNA viruses infecting
mammals and insects (10). As expected, 11- to 14-fold
greater frequencies of C->T transitions upstream of G
(depicted C->T,G) than other transitions best modelled
dinucleotide frequencies in mammalian genomic DNA.
Additionally, further mutations that eliminated UpA di- 
nucleotides and CpG dinucleotides were observed in the 
subset of sequences expressed as mRNA and entering the 
cytoplasm. Modelling provided evidence for equivalent 
selection against these dinucleotides in mammalian and
plant RNA viruses that is consistent with their under- 
representation in previous compositional analyses 
(10,11,14). The nature of this apparent selection against 
CpG dinucleotides operating on both host and viral cyto- 
plasmic RNA sequences remains unexplained functionally 
although it has been frequently hypothesized that recog- 
nition of CpG and potentially UpA motifs is part of an as 
yet uncharacterized self-non-self recognition system 
coupled to innate immunity (15). This may be functionally 
and perhaps evolutionarily related to Toll-like receptor 9
that recognizes non-methylated CpG dinucleotides in
DNA sequences (14,16). 

Further evidence that CpG dinucleotides in viral
sequences either activate or are targets of
cell defence mechanisms is provided by the observation 
that polioviruses with artificially elevated CpG frequencies 
in their genomic RNA were markedly attenuated and 
replicated to titres several orders of magnitude lower 
than wild-type virus in vitro (17). The existence of such 
recognition systems may in turn have placed additional 
selection pressures on host expressed mRNA sequences 
to evade these viral countermeasures. In the current 
study, we have developed a model system based on an 
infectious clone of the enterovirus, echovirus 7 (E7), to 
investigate the effect of modifying dinucleotide frequencies 
on virus replication in mammalian cells. The observation 
that elevated frequencies of both UpA and CpG attenuate 
viral replication while lowering frequencies below those of 
native viral sequences enhances replication provided the 
opportunity to investigate the role of different compo- 
nents of the innate immune system in dinucleotide recog- 
nition and recruitment of antiviral defences. 



MATERIALS AND METHODS 

Cell culture and cell lines 

RNA transcripts of the pT7:E7 infectious cDNA clone of 
the isolate Wallace (accession number AF465516) and 
pRiboE7luc replicon were used to generate E7 viral
stocks and the E7 replicon used in the study. Both were 
propagated in rhabdomyosarcoma (RD) cells using 
Dulbecco modified Eagle medium (DMEM) with 10% 
fetal calf serum (FCS), penicillin (100 U/ml) and streptomycin
(100 µg/ml). All cells were maintained at 37°C
with 5% CO2. Monolayer cultures of A549 cells and 
shRNA cell derivatives were used for the interferon 
pathway analyses while RD cells were used in all other 
experiments. 

In silico design of CpG- and UpA-modified viruses 

Two regions of the full-length E7 cDNA pT7:E7 clone 
were selected for mutagenesis. These lay in regions of 
the genome bounded by the unique restriction sites SalI
(genome position 1878) and HpaI (genome position 3119)
for Region 1 and EcoRI (genome position 5403) and BglII
(genome position 6462) for Region 2. To generate CpG-zero
mutants (designated lowercase 'c'), all CpG dinucleotides
were eliminated by replacing either the C or the G base
with a randomly selected alternative base that preserved
coding of the underlying sequence. A similar strategy was
used to generate UpA-low (designated 'u') and combined
zero CpG and low UpA mutants ('cu'), with the restriction
that UpAp(C or U) codons encoding tyrosine precluded
elimination of all UpA dinucleotides. As many CpG or UpA
dinucleotides as possible were introduced while preserving
coding to generate CpG-high and UpA-high sequences (uppercase 'C' and
'U', respectively). Sequence changes and their effects on
base compositions of the resulting insert sequences are
shown in Table 1. Sequences generated for the study are
provided in Supplementary Data.

Table 1. Composition of E7 and luciferase insert sequences

Region      Sequence  Symbol  G+C      Total    CpG      CpG total    CpG O/E   UpA      UpA total    UpA O/E   CAI    Enc   CP bias
                              content  changes  freq(a)  (change)(b)  ratio(c)  freq(a)  (change)(b)  ratio(d)

Region 1    Native    W       47.6%    -        0.041    51 (-)       0.730     0.050    62 (-)       0.742     0.685  56.5  -0.047
            Permuted  P       47.6%    142      0.041    51 (0)       0.730     0.050    62 (0)       0.742     0.694  55.8  -0.028
            CpG-zero  c       44.3%    56       0        0 (-51)      0         0.057    70 (+8)      0.741     0.693  47.3  0.042
            UpA-low   u       50.9%    43       0.045    56 (+5)      0.730     0.015    19 (-43)     0.256     0.672  51.9  -0.003
            Both-low  cu      47.5%    102      0        0 (-51)      0         0.015    19 (-43)     0.227     0.686  43.6  0.083
            CpG-high  C       56.5%    150      0.146    180 (+129)   1.828     0.042    52 (-10)     0.900     0.666  44.3  -0.222
            UpA-high  U       40.9%    119      0.032    38 (-12)     0.756     0.139    171 (+109)   1.593     0.726  45.8  -0.094

Region 2    Native    W       47.1%    -        0.018    18 (-)       0.320     0.047    48 (-)       0.695     0.743  52.9  0.023
            Permuted  P       47.6%    109      0.018    18 (0)       0.320     0.047    48 (0)       0.695     0.739  48.8  0.017
            CpG-zero  c       45.5%    21       0        0 (-18)      0         0.047    48 (0)       0.654     0.739  50.1  0.085
            UpA-low   u       50.0%    34       0.021    21 (+3)      0.331     0.014    14 (-34)     0.215     0.781  48.7  0.050
            Both-low  cu      48.5%    55       0        0 (-18)      0         0.037    38 (-10)     0.824     0.788  47.1  0.121
            CpG-high  C       56.4%    124      0.133    135 (+116)   1.667     0.037    38 (-10)     0.824     0.658  40.3  -0.247
            UpA-high  U       39.2%    107      0.015    15 (-3)      0.390     0.149    151 (+103)   1.633     0.606  39.0  -0.048

Luciferase  Native    L       45.3%    -        0.063    103 (-)      1.242     0.052    85 (-)       0.699     0.647  45.0  -0.103
            Both-low  l       43.2%    176      0.006    1 (-102)     0.013     0.018    19 (-66)     0.145     0.740  0.74  0.118

(a) Frequency of dinucleotide in insert region.
(b) Total number of CpG and UpA dinucleotides in sequence. Changes in numbers between mutated and original WT sequence are indicated in
parentheses.
(c) Ratio of observed dinucleotide frequency (O) to that expected based on mononucleotide composition (E), i.e. f(CpG)/(f(C) × f(G)).
(d) O/E ratio based on f(UpA)/(f(U) × f(A)).
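The CpG-zero design described in this section can be approximated in a few lines of code. The sketch below is a simplified illustration and not the procedure used to build the study sequences: it walks through the codons from left to right and, wherever a CpG occurs within a codon or across a codon boundary, swaps in a randomly chosen synonymous codon that removes it, which may change more than one base at a time. The function names, the use of the standard genetic code and the toy input are all assumptions made for this example.

```python
import random
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in UCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(seq):
    """Translate an in-frame RNA sequence (length divisible by three)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def remove_cpg(seq, rng):
    """Greedy left-to-right removal of CpG by synonymous codon replacement."""
    seq = seq.upper().replace("T", "U")
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for i, codon in enumerate(codons):
        # Flanking bases let us see CpGs that span a codon boundary.
        prev_base = codons[i - 1][-1] if i > 0 else ""
        next_base = codons[i + 1][0] if i + 1 < len(codons) else ""
        if "CG" not in prev_base + codon + next_base:
            continue
        options = [alt for alt in SYNONYMS[CODON_TABLE[codon]]
                   if "CG" not in prev_base + alt + next_base]
        if options:  # leave the codon unchanged if no CpG-free synonym exists
            codons[i] = rng.choice(options)
    return "".join(codons)


original = "AUGCGCGACGGUUCGUAA"  # toy coding sequence, not from E7
recoded = remove_cpg(original, random.Random(1))
print(recoded, translate(recoded) == translate(original))
```

The same loop with "UA" in place of "CG" would give a UpA-low variant, although UpA inside tyrosine codons (UAU, UAC) cannot be removed synonymously, matching the restriction noted in the text.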

The codon adaptation index for human codon usage
was calculated through the website
http://genomes.urv.es/CAIcal/ (9). The Enc (8) and codon pair bias [CPB;
(18,19)] were calculated using the program Composition
Scan in the SSE package (20). The CPB value was
the mean of codon pair scores listed in Supplementary
Table S1 (19) corresponding to each codon pair in the
insert sequences.
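The CPB definition above (a mean of per-pair scores over all adjacent codon pairs) can be written out as follows. This is only a sketch of the averaging step: the score dictionary stands in for the published codon pair score table, and the two values in it are invented placeholders.

```python
def codon_pair_bias(seq, pair_scores):
    """Mean codon pair score over all adjacent codon pairs in an in-frame sequence."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    pairs = list(zip(codons, codons[1:]))
    if not pairs:
        raise ValueError("need at least two codons")
    return sum(pair_scores[pair] for pair in pairs) / len(pairs)


# Placeholder scores for illustration only; real scores come from the cited table.
toy_scores = {("AUG", "GCU"): 0.2, ("GCU", "GCA"): -0.4}
print(codon_pair_bias("AUGGCUGCA", toy_scores))
```

A negative mean indicates a sequence enriched for codon pairs that are under-represented in the reference gene set.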

RNA structure prediction and sequence variability 

Prototype sequences of each species B enterovirus
(http://www.picornaviridae.com/) were scanned for RNA
secondary structure using the program Folding Energy Scan
in the SSE package, using 200-base fragments incrementing
by 152 bases and 50 sequence-order-randomized controls
generated by the algorithm NDR, which preserves dinucleotide
frequencies of the native sequence (21). Mean folding
energy difference (MFED) values for each fragment
were plotted against the mid-point of each fragment to
localize areas of sequence-order-dependent RNA secondary
structure. MFEDs were calculated in the same way
for the reverse complement of each genome sequence.
Synonymous sequence variability was determined by 
measurement of mean pairwise distances using the 
program Sequence Scan in the SSE package. 
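The windowed folding-energy scan can be pictured with the sketch below. It only covers the bookkeeping: cutting the genome into overlapping fragments and converting a native folding energy plus its scrambled-control energies into an MFED value. The folding energies themselves would come from an external RNA folding program, which is not called here. The formula is an assumption for illustration, taking MFED as the percentage difference between the native minimum free energy and the mean of the controls; the helper names and the numbers in the usage lines are hypothetical.

```python
from statistics import mean


def fragments(seq, size, step):
    """Yield (midpoint, fragment) for consecutive windows; midpoint is a 0-based offset."""
    for start in range(0, len(seq) - size + 1, step):
        yield start + size // 2, seq[start:start + size]


def mfed(native_mfe, control_mfes):
    """Percentage difference between native MFE and the mean MFE of scrambled controls.

    Assumed definition: (native / mean(controls) - 1) * 100. With negative free
    energies, a more stable native fold gives a positive value.
    """
    return (native_mfe / mean(control_mfes) - 1.0) * 100.0


# Toy usage with invented numbers (kcal/mol); real values need a folding program.
print([mid for mid, _ in fragments("A" * 10, size=4, step=3)])
print(round(mfed(-60.0, [-50.0, -50.0]), 1))
```

Plotting the MFED for each window against its midpoint gives the kind of profile described in the text, with values near zero where sequence order contributes nothing to folding stability.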

Clone construction and recovery of mutant viruses 

The full-length E7 cDNA pT7:E7 clone under the control 
of a T7 promoter was used for this study. Mutant E7 
constructs with altered CpG/UpA content were generated 
by ordering custom DNA sequences (GeneArt, Life 
Technologies, Paisley, UK). Sequences were provided in 
standard antibiotic-resistant cloning vectors and were
cloned into pT7:E7. All clones were sequenced over the
insert regions prior to further applications. To recover the
mutant viruses with altered CpG/UpA content, assembled
plasmids were linearized using NotI and RNA generated
using a MEGAscript T7 in vitro transcription kit
(Ambion). RNA (100 ng) was transfected into RD cells
using Lipofectamine 2000 (Invitrogen) according to the
manufacturer's instructions. The resulting cell lysates
were used to generate passage 1 stocks by re-infecting
RD cells. Viral titres were determined by TCID50 titration
in RD cells.

Replication phenotype 

RD cells were seeded at 5 x 10^ cells per well in six-well 
plates and subsequently infected with the wild-type (WT) 
or CpG/UpA mutants at a multiplicity of infection
(MOI) of 0.01 per cell for 1 h before removing the
inoculum and washing the cells. Samples were then
withdrawn at given time points (12, 18, 24, 30 and 42 h
post-infection) and the viral titre determined by measurement
of TCID50s in 96-well format plates. The assay was
performed in triplicate for each virus. For plaque assays,
confluent RD cells in 100-mm dishes or 6-well plates were 
inoculated with virus in DMEM and incubated for 1 h at 
37°C with occasional rocking. The inoculum was removed 
and replaced with an overlay consisting of 2% Methocell 
(MC, Sigma) in DMEM. Plates were incubated for 96 h at
37°C, fixed with 3.5% formaldehyde and stained with 
0.1% crystal violet. Plaque sizes were quantified using 
ImageJ software. 

Quantitative real-time polymerase chain reaction 

RNA was isolated from cells using the RNAspin Mini Kit 
(GE Healthcare) or from viral supernatant using 
the QIAamp Viral RNA Mini Kit (Qiagen). Reverse



4530 Nucleic Acids Research, 2014, Vol. 42, No. 7 



transcription was performed using M-MLV reverse transcriptase
(Promega) and random primers. The qPCR reactions
were done using Sensifast SYBR Hi-Rox master mix
(Bioline) in a Rotorgene-Q cycler (Qiagen) using the
primers listed in Supplementary Table S1. For E7, a
standard curve using quantified PCR product was 
carried out in parallel, allowing quantification of viral 
copy number. RNA to infectivity ratio was determined 
by viral load measurements of RNA extracted from 
5000 TCID50 units of each virus. 
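Absolute quantification from a standard curve, followed by the RNA to infectivity calculation, can be sketched as below. The sketch assumes the usual linear relationship between threshold cycle (Ct) and log10 copy number; the function names and all numeric values are invented for illustration and are not data from the study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    mean_x = sum(x) / len(x)
    mean_y = sum(ct_values) / len(ct_values)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


def rna_to_infectivity(genome_copies, tcid50_units):
    """Genome copies per infectious unit."""
    return genome_copies / tcid50_units


# Hypothetical dilution series of quantified PCR product and its Ct values.
slope, intercept = fit_standard_curve([1e3, 1e4, 1e5, 1e6], [30.0, 26.68, 23.36, 20.04])
sample_copies = copies_from_ct(25.02, slope, intercept)
print(round(slope, 2), round(sample_copies), round(rna_to_infectivity(sample_copies, 5000), 2))
```

Dividing the estimated copy number by the 5000 TCID50 units extracted gives the ratio compared between viruses in the Results.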

siRNA knockdown of PKR 

Double-stranded RNA-dependent protein kinase (PKR)-
specific siRNA (EIF2AK2, Ambion Silencer Select
validated siRNA) was used at a concentration of 33 nM.
Transfections were performed using Lipofectamine 2000
(Invitrogen) using 1 µl of Lipofectamine per well of a
24-well plate according to the manufacturer's instructions. Cells were
incubated for 48 h before further experiments were performed.
Validated non-targeted control siRNA was used
at the same concentration and incubation period. To
quantify the degree of knockdown of PKR by siRNA,
equal amounts of protein extracted from cell lysates 48 h
after siRNA transfection were blotted and detected
using a PKR-specific mAb (Ye350, Abcam) followed
by anti-rabbit HRP-conjugated antibodies (SA 1-200, 
Pierce antibodies) and ECL detection. Band densities 
were measured for two independent experiments. PKR 
mRNA expression was quantified at the same time point 
by qPCR using PKR-specific primers. 

Replicon construction and replication kinetics 

To accurately quantify intracellular viral replication, the 
pRiboE7luc sub-genomic replicon plasmid was used. This
contains a version of the E7 genome in which the struc- 
tural genes (nucleotides 753-3118) are replaced with the 
1704-bp-long firefly luciferase gene. A synthetic version of 
the luciferase gene with its CpG and UpA dinucleotides 
removed while maintaining its coding sequence was 
designed, which also included a CpG- and UpA-low 
72-bp linker sequence at the 3' end containing a SanDI restriction
site. The sequence was cloned into pT7:E7 using the
unique restriction sites KasI (genome position 781) and
SanDI (position 3191). To create replicons containing
the additional Region 2 CpG or UpA low inserts
(designated 'l|-|c' and 'l|-|u', with the original clone
designated as 'L|-|W'), a 3235-bp section of the replicon
directly downstream of the luciferase gene was excised
using SanDI and BglII restriction enzymes. This was
then replaced with the equivalent sections of the previously
described c|c or u|u constructs with their modified
Region 2 sequences. Replicon plasmids were linearized
using NotI; RNA was synthesized in vitro using T7
RNA polymerase (Ambion); and RNA integrity was confirmed
using a 2100 Bioanalyzer (Agilent) before use.

Assays were performed by transfecting 50 ng of replicon
RNA into RD cells seeded at 3 x 10"^ cells per well in 
96-well plates. RNA was transfected at given time points
(1, 4, 6, 8 and 12 h) before luciferase assays were carried
out using the Luciferase Assay System (Promega),
according to the manufacturer's instructions. Cells were
lysed using the Passive Lysis Buffer, and the cell lysate
was transferred to opaque 96-well plates for luminescence
analysis using the Glomax Multi Detection System
(Promega).

Sequencing of individual virus genomes 

Viral RNA was isolated from E7 WT, R1/R2 CpG-high 
or R1/R2 UpA-high virus stocks generated in RD cells,
and cDNA was generated. Nested primers were designed to
amplify a ~500-bp section of the modified Region 1
(nucleotides 1835-2363) and an unmodified region of
E7 (nucleotides 3241-3723) using primers listed in
Supplementary Table S1 (Supplementary Data). The
proofreading enzyme PfuTurbo DNA Polymerase
(Agilent) was used to amplify the two sections from each
cDNA. The products were purified, cloned into a TA
vector (pGEM-T easy, Promega) and transformed into
competent E. coli, generating a separate colony for each
copy of the original viral cDNA. The 500-bp inserts were
sequenced using M13 primers.

Competition assays 

Equal titres of WT and mutant virus (combined
MOI = 0.01) were applied simultaneously to RD cells in
24-well plates. Following the development of cytopathic
effect (CPE), the supernatant was
frozen, thawed and 200 µl applied to fresh RD cells.
This was continued for up to 10 passages, and performed
in triplicate for each virus competition. For the pairwise
competition assay, RD cells were inoculated with paired
combinations of seven viruses, giving 21 combinations in
total. Each pairwise assay was carried out in a single well
and passaged through RD cells 10 times. RNA was
isolated from the final supernatants; cDNA was generated
and nested PCR carried out to amplify either Region 1 or
Region 2 using primers listed in Supplementary Table S1.
The amplified fragment was then subjected to restriction
endonuclease cleavage to estimate the proportion of each
sequence (enzymes listed in Supplementary Table S2).
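The count of 21 pairwise competitions follows directly from choosing 2 of 7 viruses without regard to order. The short sketch below enumerates the pairings; the seven labels are placeholders drawn from the naming scheme used in this paper and are an assumption, since the exact set of viruses entered into the pairwise assay is not listed in this section.

```python
from itertools import combinations

# Placeholder labels; the actual seven viruses used are an assumption here.
viruses = ["W|W", "P|P", "c|c", "u|u", "cu|cu", "C|C", "U|U"]

pairs = list(combinations(viruses, 2))
print(len(pairs))
for first, second in pairs[:3]:
    print(f"{first} vs {second}")
```

Each pairing corresponds to one co-infected well that is then passaged and genotyped by restriction digestion.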

Early intra-cellular replication kinetics 

For synchronized infections, RD cells were infected at 4°C
with a total of 2 x 10** genome copies (1000 per cell) and
maintained at 4°C for 30 min before being moved to 37°C.
Cells were washed twice with PBS and then trypsinized 1 h
or 4 h post infection. The cells were then pelleted and
washed again in PBS before RNA was isolated and viral
copy number determined by qRT-PCR. Copy number was
normalized against the housekeeping gene GAPDH,
also quantified by qRT-PCR (primers listed in Supplementary
Table S1).

Replication of virus with exogenous interferon-β

RD cells in 24-well plates were pre-treated before infection 
with 1000 U/ml human interferon-β (IFNβ) for 24 h, or
mock treated with DMEM. Cells were then infected with 
WT or mutant E7 at an MOI of 1 for 1 h before the 
inoculum was removed and replaced with media containing
the same concentration of IFNβ as previously. Eight
hours post-infection, RNA was isolated as described
above. Viral copy number was determined by qRT-PCR 
as described previously. 

Analysis of viral growth following inhibition/knockdown of 
signalling pathway components 

RD cells in 24-well plates were treated 45 min before
infection with 2-aminopurine (2-AP, dissolved in PBS-
glacial acetic acid 200:1) and C16 (dissolved in DMSO)
at final concentrations of 5 and 2 µM, respectively. RD
cells were treated 2 h before infection with the cyclin-dependent
kinase (cdk) inhibitor, roscovitine (dissolved in
DMSO), at a final concentration of 40 µM. All inhibitor
experiments included control cells mock-treated with the
respective solvent alone.

Cells were infected with WT or mutant variants of E7 at
MOIs of 0.03-0.1, and viral RNA was isolated from cells
24 h later. Viral load was quantified using qRT-PCR. Cells
expressing PKR, retinoic acid-inducible gene-I (RIG-I)
or melanoma differentiation associated gene 5 (MDA5)
shRNA were used to investigate the effect of PKR, RIG-I
or MDA5 knockdown on viral growth. shPKR, shRIG-I
and shMDA5 cell lines, as well as the parental A549 line,
were infected with WT or mutant variants of E7 at an
equal RNA copy number relating to a wild-type MOI of
1. Viral RNA was isolated and quantified 8 h post
infection.

Innate and adaptive immune responses PCR array

RD cells in 24-well plates were infected with E7 WT or
mutant viruses at an MOI of 10, or a consistent viral
genome copy number equating to an MOI of 10 in the
WT. Sendai virus and Poly I:C, strong inducers of
IFNβ, were separately inoculated as positive controls.
Viral RNA titres were confirmed in virus stock preps
using qRT-PCR. Cells were harvested 8 h post infection,
and RNA was isolated. RNA from six biological replicates
was combined, and RNA integrity number was confirmed
to be greater than 9.7 for all samples using a 2100
Bioanalyzer (Agilent). cDNA was made using the RT²
First Strand cDNA Synthesis kit (Qiagen) with 800 ng
RNA per sample. The relative expression of 84 candidate
genes was then analysed with pre-made 'Human Innate &
Adaptive Immune Responses PCR Array' RT² Profiler
PCR Arrays (SABiosciences, catalogue number PAHS-
052), and RT² SYBR Green Rox Fast Master Mix
(Qiagen). Cycling conditions were 95°C for 10 min,
followed by 40 cycles of 95°C for 15 s and 60°C for 30 s.
Data were normalized and fold changes calculated
using the PCR Array Data Analysis Web Portal
(SABiosciences).

RESULTS 

Strategy for maximizing or minimizing CpG/UpA content 
in mutant viruses 

As in other small RNA viruses, the frequency of CpG
dinucleotides in the E7 genome is suppressed relative to the
expected frequency based on its G + C content, with an
observed to expected ratio of CpG dinucleotides in the
coding sequence of E7 of 0.58. Frequencies of UpA
dinucleotides are also suppressed in the E7 genome
(observed to expected ratio of 0.78).

To investigate whether CpG and UpA dinucleotide 
frequencies influenced the ability of E7 to replicate 
in vitro, we created a series of mutated viruses in which 
frequencies of both dinucleotides were changed from their
native levels. This was achieved by reverse genetics using 
the pT7:E7 infectious clone. RNA transcripts generated 
from a linearized plasmid containing the E7 complete 
genome sequence generate infectious virus for phenotypic 
characterization after transfection into a wide range of 
mammalian cells.

To select areas for mutagenesis, we sought to avoid
regions of the genome that contained RNA elements
required for replication or translation functions of the
virus, such as the cis-replicating element (CRE)
embedded in the 2C coding sequence (2). Scanning an
alignment of complete genome sequences of each of the
currently described species B serotypes (including the
pT7:E7 sequence of the infectious clone) revealed an area of
marked suppression of sequence variability that co-localized
in the 2C region with the CRE (Figure 1). Calculation
of folding energies to detect RNA secondary structure
in the genome showed prominent regions of structure in
the 5'UTR, 3'UTR and the CRE. MFED values were
consistently higher for species B sequences orientated in
the plus (genomic RNA) orientation. The remainder of
the genome showed no evidence for consistent RNA structure
formation (MFED values around zero) (Figure 1).

The combination of unrestricted synonymous variabil- 
ity and an absence of RNA secondary structure over long 
stretches of the E7 genome provided opportunities for 
altering dinucleotide frequencies without disrupting po- 
tential replication elements embedded within the coding 
sequence. Two genome regions (occupying positions 
1878-3119 and 5403-6462, individually corresponding to 
16.7% and 14.2% of the full-length genome) were selected
for mutagenesis based on these criteria. Sequences were 
modified by replacing nucleotides within CpG or UpA 
dinucleotides with alternative bases that preserved 
coding. It was possible to remove all CpG dinucleotides 
from both regions and reduce UpA to frequencies approximately
one third of wild-type levels (Table 1; CpG-
zero and UpA-low insert sequences). As an alternative 
strategy, to maximize frequencies of these dinucleotides, 
every site that could tolerate the creation of these di- 
nucleotides without changing coding was identified and 
mutated to create sequences with 2.5-3x their naturally 
occurring frequencies (Table 1; CpG-high, UpA-high). 
To ensure that sequence disruption did not damage or 
destroy undetected replication element(s) within Regions
1 and 2, sequences were permuted using the algorithm
CDLR in the SSE sequence package (E7-permuted in 
Table 1). This randomizes the order of codons within 
the sequence while maintaining coding and dinucleotide 
frequencies through swaps between equivalently coding 
triplets in the same upstream and downstream dinucleotide
contexts. All mutated sequences were then synthesized
and cloned into the pT7:E7 infectious clone using naturally
occurring restriction sites.

Figure 1. Genome organization of E7 and positions of mutated insert regions. Insert positions are compared with genome diagram and a plot of
sequence variability within species B enteroviruses at synonymous sites (blue line) and folding energies indicative of RNA secondary structure (red
and pink lines). Variability at synonymous sites (left y-axis) was computed at each codon position in alignments, plotted with a window size of 41
codons. MFED values (right y-axis) for sense and antisense RNA sequences were calculated for 200 base fragments, incrementing by 48 bases; values
plotted represent mean values of five consecutive fragments. Nucleotide positions were calculated relative to the pT7:E7 cDNA sequence.
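The idea behind the permuted control (shuffling codon order while keeping both the encoded protein and the dinucleotide composition fixed) can be illustrated with the sketch below. It is not the CDLR implementation in the SSE package: instead of matching upstream and downstream contexts explicitly, it proposes random swaps between synonymous codons and keeps a swap only if the overall dinucleotide counts are unchanged. All names and the toy sequence are assumptions made for this example.

```python
import random
from collections import Counter
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in UCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def dinucleotide_counts(seq):
    """Counts of all overlapping dinucleotides in the sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def translate(seq):
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def permute_codons(seq, attempts, rng):
    """Swap synonymous codons at random, keeping only swaps that preserve dinucleotide counts."""
    seq = seq.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    target = dinucleotide_counts(seq)
    for _ in range(attempts):
        i, j = rng.sample(range(len(codons)), 2)
        if CODON_TABLE[codons[i]] != CODON_TABLE[codons[j]] or codons[i] == codons[j]:
            continue  # not synonymous, or swap would change nothing
        codons[i], codons[j] = codons[j], codons[i]
        if dinucleotide_counts("".join(codons)) != target:
            codons[i], codons[j] = codons[j], codons[i]  # revert the swap
    return "".join(codons)


toy = "GCUAAAGCAGAUGCUAAAGCAGAU"  # made-up in-frame sequence, not from E7
shuffled = permute_codons(toy, attempts=200, rng=random.Random(0))
print(shuffled, translate(shuffled) == translate(toy))
```

Because every accepted swap is checked against the original counts, the output always encodes the same protein with the same dinucleotide composition, which is the property that makes such a sequence a useful control for undetected replication elements.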

Replicative fitness of mutants with modified 
CpG/UpA frequencies 

Wild-type E7 and mutant viruses were recovered in cell 
culture by transfecting whole-genome RNA sequences 
obtained through T7 transcription of pT7:E7. Recovered 
virus was then titred by measurement of TCID50 values 
and used in subsequent experiments. In the following 
sections and associated figures and tables, wild-type and 
permuted control sequences were designated as W and P; 
high CpG and UpA mutants as C and U and low mutants
as c and u. Clones in which Region 1 was exchanged were
designated as U|W and C|W for UpA- and CpG-high
sequences, respectively, and those in which Region 2 was
exchanged as W|U and W|C, and so on.

RNA copy to infectivity ratios were determined by 
extracting viral RNA from a known infectious titre of 
each virus, and carrying out qRT-PCR quantification 
(Figure 2). The RNA to infectivity ratio of the permuted 
double region mutant, P|P (247 ± 9.2), was similar to that 
of the WT E7 virus, W|W (354 ± 8.0), indicating that the 
process of synonymous nucleotide replacement while 
preserving native dinucleotide frequencies does not affect 
specific infectivity. In contrast, increasing either the CpG 
or UpA dinucleotide frequency dramatically increased the 
RNA to infectivity ratio, with values for the C|C mutant 
approximately 350 times the WT value and U|U approximately
20 times higher. By comparison, the RNA to infectivity
ratios of viruses with reduced CpG and UpA frequencies
(c|c, u|u and cu|cu) were comparable with those of W|W
(Figure 2).

In a multi-step infection using an MOI of 0.01, the
growth kinetics of the E7 mutants were compared with
that of the WT. Increasing the CpG or UpA dinucleotide
frequency caused a severe attenuation of viral replication,
resulting in a viral output approximately 7000-fold lower
in C|C than W|W after 24 h, and a 30-fold lower output in
U|U (Figure 3A and C). Mutant viruses replicated more
slowly as well as producing a lower final output of infectious
particles. Increasing dinucleotide frequencies in
Region 2 was more detrimental to viral replication than
in Region 1, despite its shorter length (1 kb compared with
1.3 kb), with C|W mutants replicating only 144-fold less
than wild type at 24 h, compared to nearly 1500-fold less
in W|C. Amongst the UpA-high mutants, replication was
actually improved by modifying R1, with U|W consistently
showing 10-fold greater replication than wild type
while the R2 mutant replicated similarly to U|U.
Replication of the (CDLR-)permuted control (P|P) was
indistinguishable from wild type, confirming that there
were no critical cis-acting replication elements within the
modified regions. A similar but less marked pattern of
replication differences was observed in a single-step replication
assay where cells were infected at an MOI of 10 at
time zero, including enhanced replication of the U|W
mutant (Supplementary Figure S1A).

Reducing CpG and UpA dinucleotide frequencies
below the WT level accelerated viral
replication (Figure 3B and D). While replication rates and
final viral outputs of the CpG-zero (c|c) and UpA-low
(u|u) mutants were similar to wild-type and permuted
controls, the double mutant in which both CpG and
UpA were reduced or eliminated (cu|cu) showed 10-fold
higher yield than the wild type at both 18 and 24 h post
infection. Replication differences were again less marked in
the less-sensitive single-step assay, with similar replication
curves between low mutants and controls (Supplementary
Figure S1B).

Figure 2. RNA to infectivity ratios of WT and viruses with modified
CpG/UpA frequencies. WT and mutant viruses were recovered from
RD cells and TCID50 titres determined (note log scale). The number of
viral genome copies was determined through qRT-PCR and compared
with the infectivity titre. Results are the mean and standard error from
three separate extractions. RNA/infectivity ratios were additionally
calculated for C|C and U|U mutants in the presence of C16.

Altering CpG and UpA dinucleotide frequencies also
influenced plaque sizes of individual infectious centres
created in RD cell monolayers (Figure 4). R1/R2 CpG-high
and UpA-high mutants showed plaques that were
respectively 48% (SEM ± 5%) and 43% (± 6%) the
size of WT plaques. Conversely, areas of UpA- and CpG-low
mutant plaques were consistently greater than WT
(mean values for c|c: 144% (± 19%); u|u: 129%
(± 16%) and cu|cu: 179% (± 17%) of WT size). In
contrast, plaques produced by the permuted control,
P|P, were indistinguishable from those of WT (98 ± 13%).
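
The plaque sizes above are reported as a mean percentage of WT size with a standard error. A minimal sketch of one way to obtain such values follows; it assumes each plaque area is normalized to the mean WT area before the mean and SEM are taken (the normalization actually used is not stated here), and the areas are made-up numbers.

```python
import math
import statistics


def relative_plaque_size(areas, wt_areas):
    """Return (mean, SEM) of plaque areas as a percentage of the WT mean area."""
    wt_mean = statistics.mean(wt_areas)
    relative = [100.0 * area / wt_mean for area in areas]
    # SEM = sample standard deviation / sqrt(number of plaques)
    sem = statistics.stdev(relative) / math.sqrt(len(relative))
    return statistics.mean(relative), sem


# Hypothetical plaque areas in arbitrary units (not measurements from the study).
wt_areas = [10.0, 10.0, 10.0, 10.0]
mutant_areas = [4.0, 5.0, 5.0, 6.0]
mean_pct, sem_pct = relative_plaque_size(mutant_areas, wt_areas)
print(f"{mean_pct:.0f}% of WT (SEM {sem_pct:.1f}%)")
```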

To determine whether the high RNA/infectivity ratios
and reduced replication of CpG- and UpA-high mutants
were the result of their failure to initiate an infection cycle
or whether there were later restrictions on the generation
of infectious virions, RNA copy numbers immediately
post-entry were compared with those 4 h after infection.



One hour after a synchronous infection with 1000 RNA
copies per cell (as determined by qRT-PCR), the number
of intracellular viral genome copies was found to be
similar between viruses with different dinucleotide
compositions, with 42, 19 and 36 RNA copies detected per
cell in those infected with WT, C|C and U|U mutants,
respectively (Figure 5). However, 4 h post infection,
wild-type RNA copy numbers had increased
to 2362 per cell, whilst the RNA copy numbers of C|C and
U|U mutants had increased only marginally (58 and 207 RNA
copies per cell, respectively). The existence of this early
impairment has implications for subsequent pathway
analyses (see below).
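
The scale of this early impairment is easier to see as a fold increase between the 1 h and 4 h time points. The short calculation below uses only the per-cell copy numbers quoted in the paragraph above; the variable names are illustrative.

```python
# RNA copies per cell at 1 h and 4 h post infection, as quoted in the text.
copies_per_cell = {
    "WT": (42, 2362),
    "C|C": (19, 58),
    "U|U": (36, 207),
}


def fold_increase(copies_1h, copies_4h):
    """Fold change in intracellular genome copies between 1 h and 4 h."""
    return copies_4h / copies_1h


folds = {virus: fold_increase(*pair) for virus, pair in copies_per_cell.items()}
for virus, fold in folds.items():
    print(f"{virus}: {fold:.1f}-fold increase between 1 h and 4 h")
```

On these figures WT genomes increase more than 50-fold over the interval, whereas the C|C and U|U genomes increase by less than 10-fold.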

To further investigate the enhancement of replication
observed in E7 variants with reduced CpG and UpA
frequencies, the effect of dinucleotide changes on the
replication of an E7 replicon was investigated. The E7
replicon comprises a monocistronic construct in which
structural genes are replaced by a luciferase gene derived
from the firefly (Photinus pyralis); detection of
luminescence enables genome replication to be readily detected
and quantified. Analysis of the 1.7 kb luciferase gene in
pRiboE7luc revealed a strikingly high frequency
of CpG (ratio of 1.24; Table 1), consistent with its insect
origin (20,22,23). This could potentially attenuate its
replication in mammalian cell lines. A replacement luciferase
gene was therefore designed in which the CpG ratio was
reduced to 0.013 and the UpA ratio to 0.145 (from 0.699),
in both cases the lowest values achievable without coding/
restriction enzyme site changes, through introduction of
synonymous substitutions. The CpG/UpA-low luciferase
replicon (designated cu|-|W) was further modified by
replacing Region 2 with CpG-zero and UpA-low inserts
(designated cu|-|c and cu|-|u respectively; the site of
Region 1 was occupied by the luciferase gene and is
indicated as a '-').
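
The CpG and UpA "ratios" quoted above are observed/expected frequencies. The sketch below assumes the conventional definition, f(XpY) / (f(X) × f(Y)), which is not spelled out in this section; the function name and the toy sequence are illustrative only.

```python
def dinucleotide_oe_ratio(seq, dinuc):
    """Observed/expected ratio of a dinucleotide in an RNA or DNA sequence.

    Assumes the conventional definition: frequency of the dinucleotide
    divided by the product of the frequencies of its two bases.
    """
    seq = seq.upper().replace("T", "U")
    dinuc = dinuc.upper().replace("T", "U")
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-base motif")
    # Sliding window, so overlapping occurrences (e.g. UAUA) are all counted.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc) / (n - 1)
    expected = (seq.count(dinuc[0]) / n) * (seq.count(dinuc[1]) / n)
    if expected == 0:
        return 0.0  # one of the two bases is absent, so the motif cannot occur
    return observed / expected


toy = "ACGUACGU"  # toy sequence, not taken from E7 or luciferase
cpg_ratio = dinucleotide_oe_ratio(toy, "CG")
print(f"CpG observed/expected = {cpg_ratio:.2f}")
```

A ratio near 1 indicates that the dinucleotide occurs about as often as base composition alone would predict; values well below 1 indicate suppression.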

RNA transcripts were transfected into RD cells, and
luminescence was monitored over a 12-h time-course
(Figure 6). Compared with the original pRiboE7luc
replicon, the cu|-|W mutant with the luciferase gene
replaced showed a 100-fold increase in relative
luminescence as early as 4 h post transfection. The replication of
the construct was increased a further 6-fold by
replacement of Region 2 with CpG-zero or UpA-low inserts.

Fitness comparisons of modified viruses using 
competition assays 

To confirm the differences in replicative ability of mutants
with altered CpG and UpA frequencies in a more sensitive
assay, equal MOIs of WT and mutants were co-inoculated
onto RD cells and serially passaged at high MOI.
RNA extracted at different passage numbers
was amplified across the modified region and cleaved
with restriction enzymes that differentiated WT from
mutant sequences. As expected, CpG- and UpA-high
mutants (C|C and U|U) were rapidly outcompeted by
WT virus and were almost eliminated by passage 1,
and entirely undetectable at passage 5 (data not shown).
Single region mutants (C|W and W|C) were similarly
outcompeted by passage 5, consistent with their reduced
fitness in the replication assays (Figure 3A and
Supplementary Figure S1A). Reflecting the more
marginal phenotype of UpA-high mutants, W|U was
eliminated between passages 5 and 10 while U|W
outcompeted the wild-type virus by passage 10 (Figure
7). This latter finding confirms evidence for greater
replication of this mutant compared with WT in multi- and
single-step replication assays.

Figure 3. Replication kinetics and titres at 24 h of WT and modified viruses infected at a low MOI. RD cells were infected with E7 WT, permuted
(P|P), CpG- and UpA-high mutants (C|C and U|U; A and C) or CpG- and UpA-low mutants (c|c, u|u, cu|cu; B and D) viruses at an MOI of 0.01.
Infectious titre of supernatants was quantified at indicated time points by TCID50 (A and B); mean titres and SEMs at 24 h are shown (C and D). Results are the mean
of three biological replicates.

The same assay format was used to further characterize
the unexpected greater replicative ability of E7 mutants
with reduced CpG and UpA frequencies; c|c and u|u
mutants both outcompeted WT, showing at least 80%
prevalence after only five passages. By 15 passages, the
WT was completely undetectable (data not shown). To
investigate this phenomenon further, a series of competition
assays were performed in which WT (W|W), the
permuted control (P|P) and a range of mutants with differing
degrees of CpG and/or UpA under-representation
(cu|W, W|cu, u|u, c|c and cu|cu) were each competed
with each other and assayed at passages 6 and 10
(Figure 8). The cu|cu mutant showed the highest fitness,
completely outcompeting almost all of the other viruses by
passage 6. The c|c mutant ranked second, followed
by cu|W. Lowering CpG/UpA frequency in Region 1
was demonstrated to have more effect than in Region 2, as W|cu
was rapidly outcompeted by cu|W as well as
u|u. Consistent with the replication assays, reduction
in CpG frequency showed a greater effect than that
of UpA.

Figure 4. Plaque morphology of E7 WT and double region mutant
viruses. RD cell monolayers in 10-cm plates were infected with a
similar infectious titre of virus and incubated for 96 h at 37°C.
(A) Plaque appearance. (B) Plaque sizes of WT and mutant E7
viruses calculated from 25 plaques for each virus using ImageJ
software (mean values and SEMs relative to WT control shown as
bar heights and error bars).

Figure 5. Synchronized infection with equal viral genome copies. RD
cells were synchronously infected with 1000 genome copies of WT,
R1/R2 CpG-high or R1/R2 UpA-high virus, as calculated using qRT-PCR.
Cells were trypsinized and washed 1 or 4 h post infection and the
intracellular viral load determined by qRT-PCR. Results are the mean and
standard error of three biological replicates.

Figure 6. Analysis of luciferase expression driven by E7 replicons with
reduced CpG/UpA frequencies. Replicons were generated with reduced
CpG/UpA frequencies, based on the backbone pRiboE7luc replicon, in
which the structural genes of E7 are replaced by an insect luciferase
gene. In the cu|-|W replicon the luciferase gene itself was modified to
minimize both CpG and UpA frequency; in the cu|-|c and cu|-|u
replicons Region 2 was additionally modified to further reduce either CpG
or UpA frequency. RNA was generated from replicons and
transfected into RD cells. Luminescence was measured relative to the
mock-transfected control. Results are the mean and standard error of
three biological replicates.
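
A fitness ranking of the kind described above can be derived from pairwise competition outcomes by counting how many competitors each virus outcompetes. The sketch below shows that idea only; the list of outcomes is a hypothetical subset chosen to be consistent with the statements in the text, not the data of Figure 8, and win-counting is an assumed scoring rule rather than the procedure used in the study.

```python
from collections import Counter


def rank_by_wins(outcomes):
    """Rank viruses by number of pairwise competitions won.

    outcomes: iterable of (winner, loser) pairs; ties are simply omitted.
    Returns names ordered from fittest to least fit (ties broken by name).
    """
    viruses = {name for pair in outcomes for name in pair}
    wins = Counter({name: 0 for name in viruses})
    for winner, _loser in outcomes:
        wins[winner] += 1
    return sorted(viruses, key=lambda name: (-wins[name], name))


# Hypothetical pairwise outcomes, for illustration only.
outcomes = [
    ("cu|cu", "c|c"), ("cu|cu", "cu|W"), ("cu|cu", "W|cu"),
    ("c|c", "cu|W"), ("c|c", "W|cu"),
    ("cu|W", "W|cu"),
]
ranking = rank_by_wins(outcomes)
print(" > ".join(ranking))
```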



Mechanism of dinucleotide-dependent replicative 
fitness differences 

The pathway(s) within the cell responsible for the replication
phenotypes of viruses with different dinucleotide
frequencies are unknown. Investigation of pathways
concentrated primarily on measurement of differences
in replication between WT and C|C and U|U
(high) mutants, as these showed the greatest phenotypic
differences in replication assays. Secondly, WT E7
replicated rapidly to high titre in fibroblast cell culture
and was evidently highly effective at evading host
responses induced by infection. As described below, effective
control in many aspects of both recognition and effector
pathways precluded detection of phenotypic differences
from CpG/UpA-low mutants.

Replication differences between WT and mutants with
altered dinucleotide frequencies may arise through
differences in their susceptibility to IFN-β-coupled cellular
defences or through differences in their visibility to
pattern recognition receptors (PRRs) that activate the
cellular defence response on entry. To investigate the
former possibility, susceptibilities of WT and mutant
viruses to exogenous IFN-β were determined (Figure 9).
Eight hours after infection, WT and mutant viruses
showed dose-dependent attenuation of replication, with
8- to 27-fold reductions in viral RNA levels
relative to the mock-treated controls at the highest IFN-β
dose. While generally similar to WT virus, the replication
of both C|C and U|U (high) mutants showed approximately
2-fold greater susceptibility to IFN-β than WT
while the CpG/UpA-low mutant showed approximately
2-fold greater resistance. However, these differences do
not account for the 100- to 10000-fold impairments and
10-fold enhancements in replication respectively observed
for these mutants (Figure 3).

Figure 7. Fitness determination by competition assays between WT
and modified viruses. RD cells were infected with an equal MOI of
WT and modified virus, and the supernatant serially passaged through
cells. RNA was isolated and the composition of each virus determined
through selective restriction digests. Images show the virus composition
in the starting inoculum and in three biological replicates following
passage of CpG- and UpA-low mutants and of U|W.

A second possibility to explain replication phenotypes is
that dinucleotide frequencies influence the susceptibility of
viruses to RNA editing by the IFN-β-induced p150 isoform
of adenosine deaminase RNA-specific (ADAR1) (24). A
greater number of mutations in genomes of CpG- and
UpA-high E7 mutants might therefore account for their
greatly increased RNA to infectivity ratios (Figure 2). To
investigate this, two 500 base regions of the wild type, C|C
and U|U mutants were sequenced. Sequences from either
the modified Region 1 (nucleotides 1835-2363) or an
unmodified region (nucleotides 3723-3241) were amplified
and individual population components sequenced by
cloning the PCR product and sequencing individual
clones. At least 14 clones were sequenced in each region
for each virus (Supplementary Table S3). In wild-type E7,
no mutations were observed in the total 15.5 kb
sequenced. In the C|C virus, three synonymous
single-nucleotide changes were observed in the total 15 kb
sequenced, while in U|U a total of 15.5 kb revealed one
synonymous change and one U→C substitution, converting
a methionine residue into a threonine. Although the
amount of sequence data obtained was limited, the
predominantly synonymous mutations observed in C|C and
U|U mutants do not suggest any substantially greater
susceptibility to ADAR1-mediated mutation, and this would
be unlikely to be responsible for their dramatically
impaired replication.

Figure 8. Pairwise fitness comparison between E7 WT and mutants with
varying degrees of CpG and/or UpA under-representation. RD cells
were infected with an equal MOI of two viruses represented in
columns and rows of the matrix and the supernatant serially
passaged. The composition of each virus was determined through
restriction endonuclease cleavage (see Figure 7) and outcome displayed
by colour shading. The key refers to population representations of
viruses listed in columns (for example, all but one variant were
outcompeted by the cu|cu mutant). The fitness ranking deduced from these
results is shown underneath the figure.

Figure 9. Effect of exogenous IFN-β on viral replication. RD cells
were treated with 5, 50 and 500 U/ml human IFN-β, or mock
treated, then infected with WT E7, C|C, U|U and cu|cu mutants at
an MOI of 1. RNA was isolated after 8 h and viral load determined by
qRT-PCR. Results are the mean and standard error of two biological
replicates.
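
To put the clone sequencing results on a common scale, the mutation counts can be expressed per 10 kb sequenced. The calculation below uses only the counts and sequence totals quoted in the paragraph above; the per-10 kb normalization is an illustrative choice, not one made in the text.

```python
# (mutations observed, total kb sequenced) as quoted in the text.
sequenced = {
    "WT": (0, 15.5),
    "C|C": (3, 15.0),
    "U|U": (2, 15.5),  # one synonymous change plus one U-to-C substitution
}


def mutations_per_10kb(n_mutations, kb_sequenced):
    """Observed mutations normalized to 10 kb of sequence."""
    return 10.0 * n_mutations / kb_sequenced


rates = {virus: mutations_per_10kb(*vals) for virus, vals in sequenced.items()}
for virus, rate in rates.items():
    print(f"{virus}: {rate:.2f} mutations per 10 kb")
```

With so few events, these frequencies are indicative only, in line with the caveat above that the amount of sequence data was limited.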

Another possibility for the different phenotypes of E7
dinucleotide mutants is that they differentially activate the
IFN-β response in infected cells. To investigate the cellular
response to infection, the expression of 90 innate immune
response genes was analysed following infection with E7,
C|C or U|U viruses. WT E7 induced a minimal IFN-β
response at 8 h with transcript levels at the limit of
detection, a similarly unaltered level of the early
interferon-stimulated genes (ISGs) 15 and 56, and small changes in
expression of a total of six genes from the 90 immune
response genes (Figure 10 and Supplementary Figure
S2). Of these, five were up-regulated 2- to 5-fold and one
was down-regulated. The response was quantitatively and
qualitatively much weaker than in cells transfected with
poly-I:C or infected with Sendai virus (Figure 10), where
there were large increases in the expression of IFN-β and a
range of other innate response genes.

Similarly, inoculating cells with infectious doses of C|C
and U|U mutants equivalent in RNA copy number to
those of W|W (i.e. MOIs 100- to 1000-fold lower than
WT) induced no IFN-β or ISG responses and minimal
changes in expression of other genes in the
PCR array (>2- and <5-fold induction of three and
four genes, respectively; Figure 10 and Supplementary
Figure S3). This demonstrated that RNA genomes with
increased dinucleotide frequencies entering the cell
equivalently to WT (see above) activated only a minimal cellular
response. However, infecting cells with U|U and
particularly C|C mutants at equal MOI to WT induced a higher
IFN-β response, which was also reflected by a greater
number of changes in gene expression in the PCR array.
Infection with the C|C mutant led to 12 genes being
up-regulated (and 5 down-regulated).

To investigate potential signalling pathways that were
differentially activated by viruses with differing
dinucleotide frequencies, the replication of E7 and mutant viruses
(C|C and U|U) was measured in cells in which expression
of specific signalling components was inhibited. Interferon
regulatory factor 3 (IRF3) is a central regulator of type I
IFN induction in response to the detection of viral RNA
in the cytoplasm or endosomes, and its signalling is
blocked by the viral Npro protein from classical swine
fever virus (25). Both mutant and WT viruses inoculated
at equal MOIs replicated to higher levels in cells
expressing Npro compared with the parental A549 cell line, with
8-fold increases in WT virus and 3- to 4-fold in C|C and
U|U (Figure 11). Replication of E7 is therefore at least
partially controlled through activation of the IRF3
signalling pathway even though IFN-β and other cellular
responses are highly restricted. However, there was no
evidence that IRF3-signalled inhibition of virus replication
was differentially activated by mutants with
elevated CpG and UpA dinucleotide frequencies; fold
increases in replication were actually less than those
displayed by WT E7.

IRF3 is part of a signalling pathway that is activated
upstream by the cytoplasmic dsRNA sensors RIG-I and
MDA5. Another PRR that is potentially responsible for
differential recognition of infecting viruses with differing
dinucleotide compositions is PKR. To investigate their
roles in recognition, replication of W|W, U|U and C|C
mutants was determined in A549 cells expressing
shRNAs targeting each PRR (Supplementary Figure S3).
Down-regulation of RIG-I had no effect on the replication
of WT or mutant E7 variants, while the replication of
C|C and U|U mutants (but not WT E7) was marginally
increased (2-fold) in cells expressing the MDA-5 shRNA
(Figure 12A). In contrast, down-regulation of PKR had a
partial inhibitory effect on the replication of WT and U|U
mutant E7 variants, but little or no influence on C|C
mutant replication.

We investigated this latter observation further by
inhibiting PKR expression in RD cells by transfecting
them with a PKR-specific siRNA. This down-regulated
PKR mRNA and protein expression by
>80%, irrespective of whether cells were infected by E7 or not
(Supplementary Figure S4). Under these conditions, the
replication of WT and of the C|C and U|U mutants was consistently
reduced compared with that in cells pre-treated with the
control siRNA, as was the replication of the cu|cu mutant,
which we additionally tested (Figure 12B).

Although specific knockdown of PKR expression by
shRNA and siRNA typically inhibited replication of
E7 and its mutants, treatment of cells with the kinase
inhibitor C16 dramatically enhanced replication of
mutants with increased frequencies of CpG and UpA
dinucleotides (Figures 13 and 14). Treatment of cells with
C16 led to 10- to 100-fold increases in the replication
of C|C and U|U mutants, while the replication of WT
and cu|cu (CpG/UpA-low) mutants was relatively
unaffected (Figure 13). We extended this study to determine
the cytopathology of selected mutants. Infecting cells with
WT virus at an MOI of 0.1 produced visible pathology
after 22 h, and limited cytolysis at an MOI of 0.01
(Figure 14A). This was unaffected by the addition of
C16. In marked contrast, untreated cells infected with
C|C and U|U mutants showed no cytopathology over
the experimental interval while those treated with C16
showed complete CPE at an MOI of 0.01. C16 treatment
similarly increased the infectivity of C|C and U|U mutants
(Figure 14B). Virus stocks containing 3000 PFU/ml, as
measured previously on untreated RD cells, were
re-titrated in a quantal infectivity assay with six replicates
at serial 10-fold dilutions and infectious titres determined
by probit analysis. C16 treatment had no effect on the
infectivity of WT or cu|cu E7 variants, while infectivity
was enhanced 10-fold (U|U) and nearly 1000-fold (C|C)
in mutants with increased dinucleotide frequencies
(Figure 14B). Using these revised infectivity titres, C|C
and U|U mutants showed equivalent RNA/infectivity
ratios to WT virus in the presence of C16 (Figure 2).
The attenuating effect of increased CpG and UpA
frequencies on E7 replication is thus entirely reversible
by C16; the reduced infectivity of high CpG/UpA
mutant stocks is clearly not because the virus is
intrinsically defective but because the cell is better able to prevent
infection by these mutants than by WT virus.
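
The quantal infectivity assay described above (six replicates at serial 10-fold dilutions) yields, for each dilution, a count of infected wells from which an end-point titre is estimated. The text states that probit analysis was used; the sketch below instead uses the simpler Spearman-Kärber estimator as a stand-in to show the general shape of the calculation. It is not the authors' method, and the well counts are hypothetical.

```python
def spearman_karber_log10_titre(first_log10_dilution, log10_step,
                                positives, replicates):
    """log10 of the 50% end-point titre (TCID50 per inoculum volume).

    first_log10_dilution: log10 of the least dilute sample, e.g. -1 for 1:10.
    log10_step: log10 of the dilution factor, e.g. 1 for 10-fold steps.
    positives: infected wells at each successive dilution.
    replicates: wells inoculated per dilution.
    """
    proportions = [p / replicates for p in positives]
    if proportions[0] != 1.0 or proportions[-1] != 0.0:
        raise ValueError("series must run from all-positive to all-negative")
    # Spearman-Karber: end-point = first dilution - step * (sum(p) - 0.5)
    log10_endpoint = first_log10_dilution - log10_step * (sum(proportions) - 0.5)
    return -log10_endpoint


# Hypothetical well counts for dilutions 10^-1 to 10^-6, six wells each.
log10_titre = spearman_karber_log10_titre(-1, 1, [6, 6, 6, 3, 0, 0], 6)
print(f"log10 TCID50 per inoculum volume = {log10_titre:.1f}")
```

Comparing such titres with and without inhibitor treatment gives the fold enhancement in infectivity of the kind reported for the C|C and U|U mutants.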
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Figure 11. Replication of WT and dinucleotide-modified viruses in an 
IRF3-blocked cell line. A549 cells expressing the viral protein Npro and
non-expressing control cells were infected with WT, C|C (R1/R2
CpG-high) and U|U (R1/R2 UpA-high) at an MOI of 0.01. RNA was
extracted from cell culture supernatant at 24 h and viral load determined
by qRT-PCR. Results are the mean and standard error of three
biological replicates.



Although C16 has been considered to be a specific
inhibitor of PKR (26), its effect on the replication of CpG-
and UpA-high mutants was inconsistent with the lack of
similar replication enhancement using sh- and siRNAs
directed against PKR mRNA (Figure 12). The possibility
of an off-target effect of C16 was further suggested by the
lack of an equivalent effect on the replication of C|C and
U|U mutants by 2-AP, which also inhibits PKR
phosphorylation (Figure 13). Documented additional targets of C16
include the cyclin-dependent kinases, CDK2 and CDK5
(27). To investigate whether these unintended actions of
C16 were influencing E7 replication, we determined the
effect of a well characterized inhibitor of cyclin kinases,
roscovitine, on the replication of WT and mutant variants.
Roscovitine-treated cells infected with WT and cu|cu E7
variants showed nearly 10-fold reduced E7 replication
compared with untreated cells, while the replication of U|U was
unchanged and that of C|C substantially enhanced
(approximately 9-fold). As
observed with C16 treatment, roscovitine induced visible
cytopathology in C|C-infected cells while none was
apparent in controls (data not shown). Increased C|C
replication occurred despite a general reduction in cellular
gene expression in treated cells (including a 3-fold
reduction in the housekeeping gene, GAPDH).

DISCUSSION 

Although decreases in virus fitness and replication caused by
increasing CpG and UpA frequencies in RNA virus
genomes have previously been described for poliovirus,
this study is the first to demonstrate the converse, that
decreasing frequencies of these dinucleotides significantly
increases virus replication. This remarkable finding raises
profound questions about the nature and direction of
evolutionary selection operating in enteroviruses and
potentially each of the many other RNA viruses that show
similarly suppressed CpG and UpA frequencies (10-12).
The study additionally demonstrates that while E7, along
with other enteroviruses, is highly effective at suppressing
IFN-β induction and activation of host cell defences, cells
are nevertheless able to recognize and restrict replication
of mutants with differing dinucleotide frequencies. The
findings point towards a novel mechanism for recognizing
foreign RNA that prevents establishment of virus
replication at an early time point after virus entry.

Relationship between virus replication and CpG and 
UpA frequencies 

Increasing the frequency of CpG or UpA dinucleotides in
E7 resulted in severe viral attenuation, characterized by
reductions in replication rate, smaller plaque areas, a high
particle to infectivity ratio and a low competitive fitness
relative to WT E7. These findings are consistent with
replication defects observed in poliovirus with elevated
CpG and UpA frequencies in the capsid gene (17,28,29).
The 2009 study by Burns et al. provided a convincing
demonstration that the previously reported attenuating
effect achieved by codon de-optimization (28) was
secondary and proportional to increases in CpG and UpA
frequencies rather than a slowing of virus replication by
inefficient translation. Moreover, it was shown that changes
in conventional measures of translation efficiency
(CAI, Enc, CPB) in the capsid region correlated poorly
if at all with replication rate, in contrast to direct
correlations with changes in CpG and UpA dinucleotide
frequencies.

Construction of mutants with increased bias towards
unfavourable codon pairs (19) was remarkably effective
at reducing poliovirus replication, but again these also
substantially increased CpG and UpA dinucleotide
frequencies. In that study, the PV-Min mutant
contained an observed to expected frequency of CpG of
1.34 (compared with 0.63 in the WT sequence) and of 1.36
for UpA (WT: 0.74) that might similarly account for the
observed differences in replication phenotypes rather than
translation effects. These indeed approach levels in the
CpG-high and UpA-high mutants constructed in the current
study (Table 1). In contrast, their attenuation was
inconsistent with the modest 50-70% measured reductions
in translation efficiencies of the PV-MinXY and PV-MinZ
mutants. Indeed, the failure to recover any infectious
virus from the whole capsid mutant, PV-Min, seems
incompatible with a measured 25% translation efficiency.
A previous study in which translation efficiency of
poliovirus 5'UTR mutants was reduced to 12-23% of WT
levels showed an approximately proportionate (1-1.5
log) reduction in replication kinetics (30), quite different
from the replication impairment observed in PV-MinXY,
PV-MinZ and PV-Min. Together, these findings concur
with the evident lack of correlation between CAI, Enc
and CPB values in the E7 insert sequences (Table 1) and
their replication phenotypes when inserted into E7.
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Figure 12. Effect of inhibiting PKR, RIG-I and MDA-5 expression on E7 replication. (A) Comparison of virus replication in control A549 cells at 
8 h compared with that in cells expressing shRNAs targeting PKR, RIG-I and MDA-5 (reduction in mRNA levels achieved is shown in
Supplementary Figure S3). (B) RD cells were pre-treated for 24 h with an siRNA directed against PKR and infected with WT virus or CpG-high,
UpA-high or CpG/UpA-low mutants at an MOI of 0.03. RNA was isolated after 22 h and intracellular viral loads determined by qRT-PCR.
Results are shown relative to the replication rate in cells treated with a control (validated non-target) siRNA.



As examples, CAI values for CpG-zero sequences were
almost identical to those of WT E7 while their insertion
enhanced replication. CpG- and UpA-high sequences
actually had higher CAI values yet showed an impaired
replication phenotype. There was a similar lack of
association between Enc and CPB measures and phenotype.

Our findings imply a fundamental role of dinucleotides
in the replication ability of enteroviruses and potentially other
RNA viruses that show evidence for suppression of CpG
and UpA frequencies. We would argue that their
phenotypic effects have to be included and allowed for in viral
evolutionary studies such as those measuring often subtle
effects of codon usage or codon order on virus replication.
A case in point relates to the concept of evolutionary
'robustness' of RNA viruses, based on the hypothesis that the
evolutionary fitness of viral populations is dependent on
the shape of the fitness landscape around a consensus
wild-type master sequence (31-33). A recent investigation
of the effect of different fitness landscapes compared
WT poliovirus with two mutants in which codon order,
but not usage, was altered (34). These were constructed
using capsid region sequences in which codon pair
frequencies were skewed towards those most favoured in
human coding sequences (PV-Max) and one where codon
order was randomly permuted while retaining CPB
(PV-SD). Importantly, these had the same CAI and Enc
values as WT sequences (and in the case of SD, an
identical CPB); both showed equivalent translation efficiencies
in in vitro translation assays.

As expected from their different positions in 'sequence
space', the mutants showed different mutational spectra
on passaging with Ribavirin. However, what was
unexpected was their replication phenotypes, with PV-SD
showing lower fitness but PV-Max greater fitness in
competition assays with WT. The behaviour of PV-Max is
particularly intriguing as, in common with PV-SD, there
were hundreds of sequence changes from the WT
sequence and it was likely far from an optimal 'robustness' as
conceptualized in the study. We suggest a contributory
factor or alternative explanation for the replication
phenotypes was the unintended alteration of dinucleotide
frequencies in the mutant sequences consequent to
permutation of codon order. The PV-SD mutant (unlike CDLR
in the current study) showed increased CpG and UpA
frequencies from 0.60 → 0.88 and 0.8 → 0.91 respectively,
changes that are consistent with its observed modest
fitness defect in competition against WT and lag in
replication kinetics. Conversely, optimization for CP
frequencies in PV-Max contrived to decrease CpG and
UpA frequencies to 0.40 and 0.60, which potentially
accounts for its slight fitness advantage over WT in
competition assays, although to a proportionately lesser extent
than the cu|W and cu|cu E7 mutants in the current study.
Our study additionally provides direct evidence against
the hypothesis for robustness. The E7 codon permuted
sequences (P|P) generated by the CDLR algorithm in
the current study showed 251 sequence differences from
WT yet preserved the exact dinucleotide frequencies of the
WT sequence. Although clearly also residing in a quite
different 'sequence space', P|P showed identical replication
fitness to WT, being present in equimolar amounts to WT
even after 10 passages in a competition assay format
(Figure 7).

Figure 13. Effect of the kinase inhibitors, C16, 2-AP and Roscovitine
on E7 replication. Cells were pre-treated for 45 min with 2 µM C16,
5 mM 2-AP or 40 µM Roscovitine and infected with WT virus or
CpG-high, UpA-high or CpG/UpA-low mutants at an MOI of 0.1. RNA
was isolated at 24 h, and intracellular viral loads were determined by
qRT-PCR. Results are shown relative to the replication rate in
untreated cells (note log scale).

Competition assays with WT E7 and a range of CpG-
or UpA-low mutants in Regions 1 and 2 showed a close
correlation between fitness and numbers of CpG and UpA
dinucleotides removed (Figure 8). The proposed fitness
ranking additionally demonstrates that both dinucleotides
participate in the fitness enhancement. Removal of CpG/
UpA dinucleotides in Region 1 had a greater fitness effect
than in Region 2. This likely reflects natural compositional
differences between the two regions; the Region 1 WT
sequence showed an observed/expected CpG frequency of
0.73, over twice that of Region 2 (0.32). From the first
region, 56 CpG dinucleotides were removed to generate
the CpG-zero mutant, over twice the number removed from
Region 2 (n = 21). Compositional differences also explain why
Region 2 CpG-high mutants showed greater fitness
reductions than those of Region 1, as CpGs are already more
highly represented in the latter.

Figure 14. Influence of the PKR inhibitor on E7 cytopathology and
infectivity. (A) Cytopathology of WT, C|C and U|U mutants in RD cells
infected at MOIs of 0.1 and 0.01 in 2 µM C16 or untreated control.
Monolayers of untreated cells remain intact on infection with C|C and
U|U mutants while those treated with C16 display a complete CPE
even at low MOI. (B) Infectivity determination of pre-titred virus
stock supernatant (3000 PFU/ml) of WT virus or CpG-high,
UpA-high or CpG/UpA-low mutants in the presence of C16. The enhanced
infectivity of C|C and U|U mutants restored the RNA/infectivity ratios
to those of WT virus (Figure 2).

As a possible contributory factor to the otherwise un- 
explained greater replication of U|W compared with WT 
(Figure 3A andC), generating this mutant led to 12 CpG 
dinucleotides being removed from Region 1 (Table 1). 
Based on current understanding from the phenotypes of 
other mutants, this seems insufficient to fully account for 
the fitness gain. However, we know little about whether 
specific thresholds for dinucleotide removal exist; specifically,
are fitness gains strictly proportional to the
number of CpG and UpA dinucleotides removed, or could
relatively small reductions in frequency display the full
phenotypic effect? As a broader question that relates to
thresholds, does removing (or adding) CpG and UpA dinucleotides
in two 1000- to 1200-nt regions have the same
effect on replication as adding/removing the same number
over the whole genome? Whether extending the region of
dinucleotide removal to the rest of the genome (excluding 
critical replication determinants) would lead to further 
proportionate increases in replication is similarly uncer- 
tain and is an obvious priority in further studies. 

Until we know more about the sequence motifs respon- 
sible for the differential recognition of sequences of differ- 
ent dinucleotide frequencies, these questions will remain 
unresolved. However, the existence of thresholds can be 
determined through fitness determination of mutants with 
differing degrees of CpG or UpA removal. These investi- 
gations are currently planned and will necessitate a more 
sophisticated mutagenesis algorithm that introduces add- 
itional mutations to compensate for those introduced to 
add or remove targeted dinucleotides. The method can 
maintain exact mononucleotide frequencies and protein 
coding and, to the greatest allowable extent, frequencies 
of non-targeted dinucleotides. 
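
To make the idea of targeted dinucleotide removal concrete, the following is a much simplified sketch and not the mutagenesis algorithm described above. It assumes the standard genetic code and greedily swaps each codon for a synonymous one that minimises CpG and UpA (written here as TA) within the codon and across its junctions. Unlike the planned method, it preserves only protein coding and makes no attempt to hold mononucleotide or non-targeted dinucleotide frequencies constant. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid (or stop) they encode.
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence (U is treated as T)."""
    cds = cds.upper().replace("U", "T")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def count_targets(seq: str, targets) -> int:
    """Count occurrences of any target dinucleotide in seq."""
    return sum(seq[i:i + 2] in targets for i in range(len(seq) - 1))


def remove_dinucleotides(cds: str, targets=("CG", "TA")) -> str:
    """Greedy synonymous recoding that lowers target dinucleotide counts."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    recoded = []
    for idx, original in enumerate(codons):
        # Junction context: last base already chosen on the left,
        # first base of the (still original) codon on the right.
        left = recoded[-1][-1] if recoded else ""
        right = codons[idx + 1][0] if idx + 1 < len(codons) else ""
        best = min(
            SYNONYMS[CODON_TABLE[original]],
            # Fewest targets first; keep the original codon on ties.
            key=lambda c: (count_targets(left + c + right, targets), c != original),
        )
        recoded.append(best)
    return "".join(recoded)


if __name__ == "__main__":
    wt = "ATGGCGTACGATTAA"  # toy sequence, not from the study
    mutant = remove_dinucleotides(wt)
    print(mutant, translate(mutant) == translate(wt))
```

Because each choice is made left to right against the original downstream codon, the result is not guaranteed to be a global minimum. A compensating step that restores base composition, as the text describes, would have to be layered on top.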

Finally, changes in CpG frequency showed a greater 
effect on viral replication than changes in UpA levels, 
being both more beneficial to replication when lowered, 
and more detrimental when raised. When competed 
directly, the double region CpG-low mutants were fitter 
than their UpA-low mutant counterparts. These findings 
are consistent with those in poliovirus where CpG-high 
mutants also exhibited a more severe attenuation than 
UpA-high mutants (17); selection to remove CpG di- 
nucleotides was similarly substantially greater than 
against UpA during serial passage of mutated viruses 
(28). Indeed, dissimilar patterns of CpG and UpA sup- 
pression amongst viruses and host genomes (10,11,31) 
point to different selective pressures acting upon each 
dinucleotide. 

Irrespective of the underlying mechanism by which 
removal of CpG and UpA enhances RNA virus replication,
the findings may potentially be exploited for
enhancing replication of viruses or other genomes 
recovered by reverse genetics for research, therapeutic, 
vaccine or other purposes. Changes in the luciferase 
gene alone enhanced reporter gene expression in the E7 
replicon by >10-fold even though constructs of this type
containing luciferase and other reporters with similarly
elevated CpG frequencies have been in use for many 
years without any idea that their replication or reporter 
gene expression was potentially suppressed. High CpG 
frequencies may similarly limit the expression of trans- 
genes and DNA vaccines (35); as with codon pair manipu- 
lation, the observed enhancement of expression of 
mammalian 'codon-optimized' versions of reporter genes
derived from non-vertebrates may indeed be secondary
to reductions in CpG frequencies. Finally, large scale 
removal of CpG and UpA dinucleotides from viruses 
used as seed stocks for inactivated virus production 
(including influenza A and B viruses and poliovirus)
may increase cell or egg culture virus yields substantially. 



Experimental investigation of green fluorescent protein 
transgene expression in mouse embryos and of influenza 
production in cell and egg culture using synthetic se- 
quences in which CpG and UpA have been largely 
eliminated from coding sequences is currently underway. 

Mechanism of CpG and UpA recognition 

In common with other RNA viruses infecting mammals 
and birds, enteroviruses and other picornaviruses have 
evolved numerous strategies to evade the antiviral effect 
of the innate host response to infection. The effectiveness 
of E7 in blocking these cellular response pathways is 
attested by the virtual absence of IFNβ mRNA synthesis
on infection with high MOIs of WT virus and the minimal 
up-regulation of other innate immune response genes 
(Figure 10 and Supplementary Figure S2). Evasion 
strategies likely resemble those of other enteroviruses
that are known to deploy a range of signalling blocks
mediated through proteolytic cleavage or inhibition of
signalling by RIG-I, MDA5, TIR-domain-containing
adapter-inducing interferon-β (TRIF), mitochondrial
antiviral signalling protein (MAVS) and the IFNβ
receptor, IFNAR1 (36-39), typically by the virally
encoded proteinases 2A and 3C. The enterovirus 2A pro- 
teinase additionally cleaves the translation initiation 
factor, eIF4G and turns off cap-dependent translation of 
cellular mRNAs (40). These strategies exert a powerful 
blunting effect on innate as well as acquired immune re- 
sponses in the host, as exemplified in hepatitis A virus 
through cleavage of both TRIF and MAVS and prevention
of both TLR- and cytoplasmic PRR signalling (41).

Because E7 effectively paralyses IFNβ-coupled cellular
responses in infected cells, it is most unlikely that the differential
replication ability of E7 mutants with altered dinucleotide
frequencies is mediated through differences
in susceptibility to IFNβ-coupled defence pathways.
Observationally, both WT and CpG- and UpA-high
mutants were similarly susceptible to exogenously added
IFN (Figure 9) and E7 mutants showed little evidence for
greater RNA editing by the IFNβ-activated p150 isoform
of ADAR-1 (24) that might otherwise have potentially
accounted for their marked differences in RNA/infectivity
ratios (Figure 2). The restoration of specific infectivity of
C|C and U|U variants by C16 to WT levels (Figure 2)
indeed demonstrates that these mutant viruses are not intrinsically
defective.

Differences in viral fitness are therefore likely to arise 
due to differential activation of the host innate immune 
system, a hypothesis supported by the observation of 
almost complete attenuation of replication very early 
after infection of RD cells with C|C and U|U mutants. 
These were able to enter cells within an hour but subse-
quently either failed to initiate replication or showed dra- 
matically reduced genome amplification compared to WT 
E7 (Figure 5). Candidate PRRs that conventionally rec- 
ognise infection by RNA viruses include RIG-I and 
MDA5 although ascribing a protective role against E7 is 
problematic. For the former, recognition requires RNA 
sequences containing 5' triphosphate groups while 
MDA5 requires long dsRNA sequences created during 
virus replication (42-44); neither of these PAMPs is
present immediately after virus entry. Furthermore, their
signalling is likely to be substantially inhibited by viral
proteinases that cleave MAVS, TRIF and potentially 
also the PRRs themselves (36-39). Furthermore, neither 
RIG-I nor MDA5 seem capable of inducing responses in 
the absence of virus replication and synthesis of dsRNA 
replication intermediates (45). Finally, inhibition of RIG-I
and MDA5 expression led to, at best, minimal increases in 
E7 replication (Figure 12). This strongly suggests that 
neither of these conventional PRRs are primarily respon- 
sible for the block on replication of high CpG or UpA 
mutants. 

Remarkably, however, treatment of cells with C16, generally
considered to be a PKR-specific inhibitor, led to
substantially enhanced replication of the CpG-high 
and UpA-high mutants (~90- and 10-fold respectively 
greater viral loads compared with untreated cells; 
Figure 13). Furthermore, parallel increases in the infectiv- 
ity of titred C|C and U|U stocks (Figure 14B) entirely 
reversed the phenotypic effect of adding CpG and UpA 
dinucleotides to their genomes and restored RNA to in- 
fectivity ratios of these mutants to wild-type levels 
(Figure 2). Although we initially suspected a role of 
PKR in mediating the differential cellular response to E7 
variants of different dinucleotide composition, several 
competing observations argue against this hypothesis. 
Firstly, other inhibitors of PKR expression or function 
failed to show a comparable enhancement of C|C and 
U|U replication, including a shRNA and a siRNA 
directed against PKR and the PKR inhibitor, 2-AP 
(Figures 12 and 13). Secondly, another kinase inhibitor, 
roscovitine, also enhanced the replication of C|C and U|U 
(and strongly suppressed WT virus). This is despite having 
no identified inhibitory activity against PKR and being 
largely specific for the cell cycle regulation proteins, 
CDK1, CDK2, CDK5, CDK7 and CDK9 (46,47).
Because C16 has also been shown to inhibit many
of these kinases (27), one might conjecture that the different
replication abilities of the E7 mutants may be primarily
determined by some aspect of cell cycle regulation.

Replication of at least some enteroviruses is maximally
productive at the G1/S stage of the cell cycle (48,49),
with very recent evidence that nuclear translocation
of enteroviral proteins can prolong the S phase and
enhance productive virus replication (50). Roscovitine
indeed has the effect of arresting cell division in late G1
through inhibition of cdk2/cyclin E and cdk2/cyclin A (46).
However, why E7 mutants with increased dinucleotide
frequencies are more sensitive to cell cycle stage than
WT virus would need to be explained. This model also
does not explain why roscovitine and C16 actually inhibited
the replication of WT and cu|cu mutants. Their
inhibition of cdk2 and cdk5 should arrest cells in the
most favourable stage in the cell division cycle for virus
replication.

An intriguing alternative possibility is that there exist
as yet uncharacterized homologues of stress response
proteins that are inhibited by C16 and other kinase
inhibitors. Four proteins are currently known to
induce stress responses through a common pathway by
phosphorylation of eIF2α (51,52). These proteins possess
homologous kinases but distinct recognition domains that 
are activated by different stress factors in the endoplasmic 
reticulum, including dsRNA (PKR), mis- or unfolded 
proteins (PERK) and low amino acid concentrations 
(GCN2). Perhaps there exist one or more additional 
members of this group that activate the stress response 
pathway with further recognition motifs (high CpG or 
UpA RNA) and which are also inhibited by C16 (52). 
Investigation of cellular binding partners of transfected 
RNA of different dinucleotide compositions represents a 
promising future approach to identify what proteins are 
involved in this recognition process. 

The evolutionary basis for dinucleotide frequency biases 
in RNA viruses 

Although the current study clearly demonstrates the influ- 
ence of both CpG and UpA frequencies on virus replica- 
tion in cell culture, it invites as many questions as it
answers. The over-riding question is why RNA viruses, 
with their extraordinary adaptive abilities, have not 
evolved further to reduce CpG and UpA frequencies to 
levels lower than they currently are, given the evident rep- 
lication advantage this provides at least in cell culture 
(Figures 3, 4, and 6-8). The obvious answer must be 
that to do so would impair their fitness in their natural 
hosts through immune response mechanisms not repre- 
sented in fibroblast cell culture. This may include inter- 
actions with specialized antigen-presenting cells and 
induction of inflammatory and adaptive immune re- 
sponses that primarily drive the systemic response to 
and eventual outcomes of virus infections. Alternatively, 
enhanced replication may somehow impair transmission 
kinetics in the host population. We have established
equivalent panels of high- and low-CpG/UpA mutants
of Theiler's murine encephalomyelitis virus and H1N1 influenza
A virus for infection of mice to investigate the
effect of dinucleotide frequency changes on in vivo 
fitness. The key question to explore is the behaviour of 
mutants with lowered dinucleotide frequencies, whether 
they achieve greater or lesser degrees of replication 
in vivo and separately whether they are attenuated or 
show enhanced pathogenicity. Although not formally 
investigated to date, if we take the previously described 
PV-Max mutant as an example of a virus with suppressed 
CpG and UpA frequencies (34), the observed threefold 
greater viral titres in brain of peripherally inoculated 
mice seem to imply that their suppression confers an
in vivo fitness advantage. 

Further in vivo studies are clearly required to place di- 
nucleotide frequencies into an appropriate evolutionary 
and immunological framework. There is an additional 
need to further characterize effects of other nucleotide 
and dinucleotide compositional abnormalities of RNA 
virus genomes on cellular response. This applies in the
current study to the unexpected enhanced replication of
the U|W E7 mutant (Figure 3) that contrasted with the
otherwise primarily dose-dependent relationship between
dinucleotide composition and replication ability in the
current study (Figures 3 and 4) and in previous mutational
analyses of poliovirus (17). Recent investigations of the
effect of biased mononucleotide compositions of human 
immunodeficiency virus type 1 on cellular responses 
provide evidence for further complexities in the inter- 
action between the viral genomic RNA sequences and 
host defences (53,54). 

A wide range of other issues remain to be resolved. 
While changing UpA frequencies mirrored at least in 
part phenotypes induced by altering CpG frequencies, it 
remains unclear whether they share recognition mechan- 
isms and/or susceptibilities to cellular responses. For 
example, UpA (along with UpU) but not CpG is a 
preferred target for IFN-induced RNAseL (55,56) and 
plays a determining role in mRNA turnover rates (57).
Replication of high UpA mutants was enhanced less by 
C16 (Figures 13 and 14) providing some evidence that 
cellular recognition mechanisms underlying the replication 
phenotypes of U|U and C|C mutants were distinct. 
Similarly difficult to account for is the observation that 
CpG (and UpA) frequencies are markedly suppressed in 
plant viruses even though viral defence mechanisms in 
plants are based on entirely different recognition, 
signalling and effector pathways from those in mammals. 
Uncovering the nature of selection against these dinucleo- 
tides in plants will be highly informative in further under- 
standing evolutionary processes that govern dinucleotide 
compositional constraints in a wider eukaryotic paradigm. 